Abstract


The purpose of a cryptosystem is to allow people to communicate securely 
over an open channel. Before one can discuss whether a cryptosystem 
meets this goal, however, one must first rigorously define what is meant by 
security. 


Three very different formal definitions of security for public-key cryp- 
tosystems have been proposed—two by Goldwasser and Micali and one by 
Yao. In this thesis, it is shown that the three definitions are essentially 
equivalent. 


As originally proposed, the three definitions are not equivalent. The 
inequivalence, however, is caused only by some minor technical choices. 
After rectifying those choices, we prove all three definitions to be equivalent. 
This equivalence provides evidence that the right formalization of the notion 
of security has been reached. 


Portions of this thesis represent joint work with Silvio Micali. 
Thesis Supervisor: Prof. Silvio Micali 


Title: Associate Professor of Computer Science 


This research was supported by a General Electric Foundation Fellowship, NSF grant 
DCR-8509905, and by a National Science Foundation Fellowship. 


Acknowledgments 


I have been extremely fortunate to have Silvio Micali as my research ad- 
visor. He has been an outstanding source of ideas and guidance. The 
technical content of this thesis has benefitted greatly from his insights; the 
prose in this thesis has benefitted greatly from his abundant help in editing. 


I have also benefitted from numerous technical discussions with Shafi 
Goldwasser. I want to thank her for her insight and helpfulness. 


I'd like to thank Ravi Boppana for his help with Chernoff bounds, and 
Tom Cormen for his help in navigating the typesetting program. 




Contents


1 Introduction
  1.1 Public-Key Cryptography
  1.2 The RSA Public-key system
  1.3 Probabilistic Encryption
  1.4 Security

2 Notation and Public-key Scenarios
  2.1 Notation and Conventions for Probabilistic Algorithms
  2.2 Cryptographic Scenarios
  2.3 Passes

3 Definitions of Security
  3.1 Informal Discussion
  3.2 GM-security (3-pass)
  3.3 Semantic Security (3-pass)
  3.4 Y-security (3-pass)
  3.5 The original definitions vs. mine
  3.6 Inequivalence of the original definitions

4 Main Results
  4.1 Semantic Security Implies GM-security
  4.2 Y-security implies GM-security
  4.3 GM-security implies Y-security

5 One-Pass Scenarios
  5.1 GM-security (1-pass)
  5.2 Semantic Security (1-pass)
  5.3 Y-security (1-pass)
  5.4 Equivalence

Chapter 1 


Introduction 




1.1 Public-Key Cryptography 


The era of modern cryptography began with Diffie and Hellman's famous
1976 paper [4], which presented the concept of public-key cryptography.
Informally, there is a community of users, A, B, ... In the Diffie-Hellman
paradigm, each user U in the system selects a pair of encryption/decryption
algorithms (E_U, D_U) such that for all x, D_U(E_U(x)) = x. User U publishes
E_U (in an ad hoc public file) but keeps D_U secret. Any other user, in order
to send a binary string m securely to U, first looks up E_U, then computes
y = E_U(m), and finally sends y to U.


Diffie and Hellman insisted that such a system be secure against any
adversary who wiretaps the communications channels, intercepts the
cyphertext y, and tries to compute D_U(y). Note that the concern here is only
with passive adversaries; our adversary is not, for example, allowed either
to alter the messages sent or to inject his own messages into the system.


The security of cryptosystems based on the Diffie-Hellman model, and in 
fact the security of every cryptosystem I will discuss in this thesis, is based 
on complexity theory. That is to say, statements such as “No adversary 
can extract the plaintext” or “No adversary can compute any information 
about the plaintext” really mean that it is computationally infeasible for 
an adversary to do such a thing. 


The first concrete implementation of a cryptosystem based on Diffie and
Hellman's idea was the RSA scheme of Rivest, Shamir, and Adleman [8].
A brief description of their cryptosystem is given below.


1.2 The RSA Public-key system 


Alice's preprocessing steps:


1. Find two random distinct primes p_1, p_2.

2. Compute n = p_1 p_2.

3. Compute φ(n) = (p_1 - 1)(p_2 - 1).

4. Compute s, t such that st = 1 mod φ(n). Thus (s, φ(n)) = 1, that is,
   s and φ(n) are relatively prime. (Given a modulus m, there is a fast
   algorithm for computing multiplicative inverses mod m. Hence all we
   need to do is pick s at random from Z*_φ(n) and then compute its
   inverse.)

5. Publish s, n in some public file ("phone book"), but keep t secret.


Instructions for Bob:


1. Bob has a message m = "Hi! ..." that he wishes to send to Alice.
   First he must of course represent m as a binary number in some
   agreed-upon way.

2. Bob then computes y = m^s mod n and sends y to Alice.


Instructions for Alice:


1. Alice can easily recover the plaintext by computing m = y^t mod n.
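

The steps above can be sketched in a few lines of Python. This is only a toy illustration: the primes 61 and 53 and the function names are assumptions chosen for readability, and parameters this small offer no security whatsoever.

```python
from math import gcd
import secrets


def make_keys(p1: int, p2: int):
    """Alice's preprocessing, given two distinct primes (toy sizes only)."""
    n = p1 * p2
    phi = (p1 - 1) * (p2 - 1)
    # Pick s at random among the units mod phi(n), then invert it.
    while True:
        s = secrets.randbelow(phi - 2) + 2      # s in [2, phi - 1]
        if gcd(s, phi) == 1:
            break
    t = pow(s, -1, phi)                         # modular inverse (Python 3.8+)
    return (s, n), t


def encrypt(public, m: int) -> int:
    """Bob's step: y = m^s mod n."""
    s, n = public
    return pow(m, s, n)


def decrypt(public, t: int, y: int) -> int:
    """Alice's step: m = y^t mod n."""
    _, n = public
    return pow(y, t, n)


public, t = make_keys(61, 53)       # hypothetical toy primes
message = 1234                      # must be smaller than n = 3233
cyphertext = encrypt(public, message)
recovered = decrypt(public, t, cyphertext)
```

Note that `encrypt` is a deterministic function of the message and the public pair (s, n): encrypting the same message twice gives the same cyphertext.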


1.3 Probabilistic Encryption


The RSA scheme, and indeed any cryptosystem following the Diffie-Hellman
model, is deterministic. That is to say, any given plaintext message
has a unique encryption. As Goldwasser and Micali pointed out [5],
discussing the security of a deterministic public-key cryptosystem is a
tricky business. For instance, a deterministic public-key cryptosystem
cannot be used to send securely a message drawn from a given small set,
say {0,1,...,10}. In fact, any "code breaking algorithm" may, on input E
and a cyphertext y, first check whether E(i) = y for i = 0,...,10.
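

A minimal sketch of this exhaustive check follows. The deterministic encryption function is a stand-in (textbook RSA with hypothetical toy parameters s = 17, n = 3233); any public deterministic E would serve equally well.

```python
def toy_deterministic_encrypt(m: int) -> int:
    """Stand-in for a public deterministic E (hypothetical toy parameters)."""
    s, n = 17, 3233
    return pow(m, s, n)


def break_small_message_set(E, y, candidates):
    """Return the candidate i with E(i) == y, or None if there is none."""
    for i in candidates:
        if E(i) == y:
            return i
    return None


y = toy_deterministic_encrypt(7)
recovered = break_small_message_set(toy_deterministic_encrypt, y, range(11))
```

The attacker needs only the public algorithm and at most eleven encryptions; no knowledge of the decryption algorithm is involved.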


Also, even when the tapper is not capable of retrieving m from E and
E(m), he may be able to compute some partial information about m, say
the value of P(m) for some predicate P. From this point of view, if messages
are deterministically encrypted, then an adversary can always extract some
information about the plaintext from the cyphertext. At the very least, an
adversary can easily compute, given only E and the cyphertext y, the
following predicate P_E:

\[
P_E(m) =
\begin{cases}
0 & \text{if the last bit of the encryption } E(m) \text{ is } 0 \\
1 & \text{if the last bit of the encryption } E(m) \text{ is } 1
\end{cases}
\tag{1.1}
\]


We are not claiming that P_E is an interesting predicate. We are simply
pointing out the difficulties of discussing the security of deterministic
cryptosystems.


To prevent an adversary from computing even such partial information
about the plaintext from the cyphertext, Goldwasser and Micali suggested
using probabilistic encryption algorithms. In other words, one may think of
the encryption algorithm as an algorithm with two inputs, E = E(·,·): the
message to be transmitted and a random string (selected by the sender). If
one chooses a probabilistic encryption algorithm properly, then every
plaintext message will have many different encryptions, but a given
cyphertext will still be the encryption of only one plaintext message.
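

As a concrete illustration, here is a toy sketch of bit-by-bit probabilistic encryption based on quadratic residuosity, in the style of the scheme of [5]. The primes, the choice y = N - 1, and all function names are assumptions made for this example; the sizes are far too small to be secure.

```python
import secrets
from math import gcd

P, Q = 499, 547        # hypothetical toy primes, both congruent to 3 mod 4
N = P * Q
Y = N - 1              # -1 is a non-residue mod P and mod Q when both are 3 mod 4


def encrypt_bit(b: int) -> int:
    """E(b, r) = Y^b * r^2 mod N for a random unit r chosen by the sender."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if gcd(r, N) == 1:
            break
    return (pow(Y, b, N) * r * r) % N


def decrypt_bit(c: int) -> int:
    """Uses the secret factor P: c encrypts 0 exactly when c is a square mod P."""
    # Euler's criterion: c is a nonzero square mod P iff c^((P-1)/2) = 1 mod P.
    return 0 if pow(c, (P - 1) // 2, P) == 1 else 1


encryptions_of_zero = {encrypt_bit(0) for _ in range(50)}
encryptions_of_one = {encrypt_bit(1) for _ in range(50)}
```

Each bit has many encryptions, yet the sets of encryptions of 0 and of 1 are disjoint, so decryption is unambiguous.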


This choice also allowed Goldwasser and Micali to introduce rigorous 
and convincing notions of security. Changing the scenario from determin- 
istic to probabilistic becomes necessary as their security conditions cannot 
be met by any deterministic cryptosystem. 


1.4 Security


The key desideratum for any cryptosystem is that encrypted messages must 
be secure. Before one can discuss whether a cryptosystem has this prop- 
erty, however, one must first rigorously define what is meant by security. 
Three different rigorous notions of security have been proposed. Gold- 
wasser and Micali [5] suggested two different definitions, polynomial se- 
curity and semantic security, and proved that the first notion implies the 
second. Yao [11] proposed a third definition, one inspired by information 
theory, and suggested that it implies semantic security. 


Not completely knowing the relative strength of these definitions is
rather unpleasant. For instance, several protocols have been proved correct
adopting the notion of polynomial security. Are these protocols secure
only with respect to that particular definition, or are they secure in a
more general sense? In other words, a natural question arises: Which of
the definitions is the "correct" one? Even better: How should we decide
the "correctness" of a definition?


The best possible answer to these questions would be to find that the 
proposed definitions—each attempting to be as general as possible—are all 
equivalent. In this case, one obviously no longer has to decide which one 
definition is best. Moreover, the equivalence suggests that one has indeed 
found a strong, natural definition. 


In this thesis, I will show that the three notions are essentially equiva- 
lent. The three originally proposed definitions were not equivalent. How- 
ever, as I will point out, this inequivalence was caused only by some minor 
technical choices. We can prove, after rectifying these marginal choices, the 
desired equivalences and keep the spirit of the definitions intact. 


Chapter 2 


Notation and Public-key 
Scenarios 


2.1 Notation and Conventions for Probabilistic Algorithms


The notation I present here is almost identical to that introduced by
Goldwasser, Micali, and Rivest [6].


I emphasize the number of inputs received by an algorithm as follows. If
algorithm A receives only one input I write "A(·)", if it receives two inputs
"A(·,·)", and so on.

“PS” will stand for “probability space”; in this paper we only consider 
countable probability spaces. In fact, we deal almost exclusively with prob- 
ability spaces arising from probabilistic algorithms. 


If A(·) is a probabilistic algorithm, then for any input i, the notation
A(i) refers to the PS which assigns to the string σ the probability that A,
on input i, outputs σ. Notice the special case where A takes no inputs; in
this case the notation A refers to the algorithm itself, whereas the notation
A() refers to the PS defined by running A with no input. If S is a PS,
denote by Pr_S(e) the probability that S associates with element e. Also,
we denote by [S] the set of elements to which S gives positive probability.
In the case that [S] is a singleton set {e} we will use S to denote the
value e; this is in agreement with traditional notation. (For instance, if
A(·) is an algorithm that, on input i, outputs i^3, then we may write
A(2) = 8 instead of [A(2)] = {8}.)

If f(·) and g(·,···) are probabilistic algorithms then f(g(·,···)) is the
probabilistic algorithm obtained by composing f and g (i.e. running f
on g's output). For any inputs x, y, ... the associated probability space is
denoted f(g(x, y, ...)).


If S is any PS, then x ← S denotes the algorithm which assigns to x an
element randomly selected according to S; that is, x is assigned the value e
with probability Pr_S(e). If F is a finite set, then the notation x ← F
denotes the algorithm which assigns to x an element randomly selected from
the PS which has sample space F and the uniform probability distribution on
the sample points. Thus, in particular, x ← {0,1} means x is assigned the
result of a coin toss.


The notation Pr(p(x,y,...) | x ← S; y ← T; ...) denotes the probability
that the predicate p(x,y,...) will be true after the ordered execution of
the algorithms x ← S, y ← T, etc. I use analogous notation for expected
value, Ex(f(x,y,...) | x ← S; y ← T; ...), where now f is a function
which takes numerical values.
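

For finite probability spaces, this notation can be evaluated exactly by enumerating the ordered execution. The sketch below is an illustration only; the names `uniform` and `pr`, and the representation of a PS as a dictionary from values to probabilities, are assumptions of this example rather than part of the notation. Each step receives the values drawn so far, so a later space may depend on earlier draws.

```python
from fractions import Fraction


def uniform(finite_set):
    """The PS with sample space `finite_set` and the uniform distribution."""
    elems = list(finite_set)
    return {e: Fraction(1, len(elems)) for e in elems}


def pr(predicate, *steps):
    """Pr(predicate(x, y, ...) | x <- S; y <- T; ...).

    Each step is a function that maps the values drawn so far to a finite PS,
    represented as a dict from value to probability.
    """
    def go(values, weight, remaining):
        if not remaining:
            return weight if predicate(*values) else Fraction(0)
        space = remaining[0](*values)
        return sum(
            (go(values + (v,), weight * p, remaining[1:]) for v, p in space.items()),
            Fraction(0),
        )
    return go((), Fraction(1), steps)


coin = uniform([0, 1])

# Pr(x = y | x <- {0,1}; y <- {0,1}): two independent coin tosses.
p_equal = pr(lambda x, y: x == y, lambda: coin, lambda x: coin)

# Pr(x = y | x <- {0,1}; y <- {x}): the second draw depends on the first.
p_copy = pr(lambda x, y: x == y, lambda: coin, lambda x: uniform([x]))
```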

Let RA denote the set of probabilistic polynomial-time algorithms. I
assume that a natural representation of these algorithms as binary strings
is used.


By 1^n we denote the unary representation of the integer n, i.e.

\[
1^n = \underbrace{11\cdots1}_{n}.
\]


2.2 Cryptographic Scenarios 


Here I specify those elements that are necessary for all public-key cryptog- 
raphy. 


A cryptographic scenario consists of the following components: 


• A security parameter n which is chosen by the user when he creates
  his encryption and decryption algorithms. The parameter n will
  determine a number of quantities (length of plaintext messages, overall
  security, etc.).


• A sequence of message spaces, M = {M_n}, from which all plaintext
  messages will be drawn. M_n consists of all messages allowed to be sent
  if the security parameter has been set equal to n. In order to make our
  notation simpler (but without loss of generality), we'll assume that
  M_n = {0,1}^n. There is a probability distribution on each message
  space, Pr_n : M_n → [0,1], such that the sum of Pr_n(m) over all
  m ∈ M_n equals 1.


• A public-key cryptosystem is an algorithm C ∈ RA that on input 1^n
  outputs the description of two polynomial-size circuits E and D such
  that:


  1. E has n inputs and l(n) outputs, and D has l(n) inputs and
     n outputs. (l is some polynomial that gives the length of the
     cyphertext.)


  2. E is probabilistic; D is deterministic.

  3. For all m ∈ M_n, Pr(D(α) = m | (E,D) ← C(1^n); α ← E(m)) = 1.


  Notice that [E(m)] is a set which is typically quite large. Our notation
  requires us to write α ∈ [E(m)] to refer to α, a particular encryption
  of m. Nevertheless, we will sometimes sloppily write E(m) for a
  particular encryption of m when the meaning is clear.


• The number of "allowed passes." This number specifies how A and
  B agree upon an encryption algorithm E output by the public-key
  cryptosystem. To this crucial notion (surprisingly neglected so far),
  we devote the next section.


2.3 Passes


Within the public-key model, A and B can alternate communicating back
and forth as many times as they feel is necessary to achieve security. Call
each alternation a pass.


Any number of passes is, of course, permissible. I concentrate on what
I believe are the two most interesting and important cases, one and three
passes. I do not consider more than three passes because, if trapdoor
permutations exist, a well-designed probabilistic encryption scheme can
achieve as much security as is possible using only three passes.


Three-pass systems 


The three-pass case is, perhaps, the most natural to think about. It
corresponds to a telephone conversation. A has a message m that she wants
to securely communicate to B. A calls up B and says, "I have a message
I'd like to send to you." B, so alerted, proceeds to generate an
encryption/decryption algorithm pair, (E,D), and tells A, "Please use E to
encrypt your message." A then uses E to encrypt her message and tells B,
"E(m)."


Notice the key property of a three-pass system: The message and the 
encryption algorithm are selected independently of one another. We are 
nevertheless in a public-key model, since anyone tapping the phone line 
gets to hear B tell E to A. 


One-pass systems 


A one-pass system corresponds to what is commonly called a public file
system. In the one-pass model, A simply looks up B's public encryption
algorithm, E, in a "phone book" and uses it to encrypt her message. (One
pass is a slight misnomer. At some point, in what we may view as a
preprocessing stage, B must have communicated his encryption algorithm,
presumably by telling it to whoever publishes the phone book of
encryption algorithms, and thus indirectly to A. "One and a half passes"
might be more accurate. "Half" refers to the preprocessing stage that needs
to be performed only once.) In this case, the choice of message can depend
on E.


Chapter 3 


Definitions of Security 




3.1 Informal Discussion


The main result of this thesis is:


GM-security, semantic security, and Y-security (all formally de- 
fined later in this chapter) are equivalent for both three-pass 
and one-pass cryptosystems. 


Interestingly, the equivalence still holds in the one-pass scenario, but the 
notions of security vary between the one-pass and three-pass scenarios. This 
point has not been given the proper attention, because people frequently 
confuse the notion of one-pass public-key cryptography with public key 
cryptography in general. 


The distinction, however, is crucial for avoiding errors, particularly in 
cryptographic protocols. Let us informally state the two definitions of se- 
curity that are achievable in the two scenarios if trapdoor permutations 
exist. 


3-pass A cryptosystem is secure if, for every message m in the message
       space, it is impossible to efficiently distinguish an encryption of m
       from random noise.

1-pass A cryptosystem is secure if, for every message m that is efficiently
       computable on input the encryption algorithm alone, it is impossible
       to efficiently distinguish an encryption of m from random noise.


In other words, in the one-pass scenario one cannot just blithely write, "For
all messages m." For instance, if one closely analyzes all known public-key
cryptosystems, it is conceivable that if (E, D) is an encryption/decryption
pair, then D can be easily computed from E(D). For example, the
constructive reduction of security to quadratic residuosity given by
Goldwasser and Micali [5] for their cryptosystem would vanish if the
encrypted message were allowed to be D itself.


Such problems cannot arise in the three-pass scenario because the en- 
cryption algorithm E is selected after and independently of the message 
m. 


In this thesis, we will only prove the desired equivalences in detail for 
the three-pass scenario. The proof for the one-pass scenario is sketched 
in the final chapter. The reason for this choice is that the definitions of 
security are much more easily stated for three-pass systems. It is much 
more convenient to say, “For all messages m,” than “For all messages m 
that are efficiently computable given the encryption algorithm as an input.” 


3.2 GM-security (3-pass) 


This definition is essentially what Goldwasser and Micali [5] called polyno- 
mial security. 


A line tapper is a family of polynomial-size probabilistic circuits T =
{T_n}. Each T_n takes four strings as input and outputs either 0 or 1.
However, to make our next equation more readable, we will treat T_n's output
as being either its second or third input (0 or 1 respectively).


1 Notice that if Bob publishes an encryption algorithm E in the public file while keeping
its associated decryption algorithm D secret, then any other user, being limited to efficient
computation and ignorant of D, necessarily selects her message m efficiently from the input
E (maybe without even looking at E) and perhaps other inputs altogether independent
of (E, D). However, in designing cryptographic protocols, one would often like to be able
to transmit things like E(D). For instance, if that type of message were allowed, one
would have a trivial solution to the problem of verifiable secret sharing [3].


Definition Let C be a public-key cryptosystem. C is GM-secure if for
all line tappers T and c > 0, for all sufficiently large n, for every
m_0, m_1 ∈ {0,1}^n,

\[
\Pr\bigl(T_n(E, m_0, m_1, \alpha) = m \;\big|\; m \leftarrow \{m_0, m_1\};\ E \leftarrow C(1^n);\ \alpha \leftarrow E(m)\bigr) < \frac{1}{2} + n^{-c}.
\tag{3.1}
\]


Remark: In reading the above definition, one should pay close attention
to our notation. Upon casual consideration of 3.1, one might conclude that
there aren't any GM-secure cryptosystems! After all, the definition says
that the encryption E must be secure against any m_0 and m_1, both of which
are given as inputs to the line tapper. What happens if we put m_0 = D,
a description of the decryption algorithm? The answer to this question
is that our notation specifies that first we choose m from {m_0, m_1} (and
thus m_0 and m_1 had already been set), and then we choose our encryption
algorithm. If C is GM-secure, then the probability that C(1^n) assigns to
any given output is quite small, say O(2^{-n}). Thus there's little worry
that C will just happen to output a decryption algorithm D = m_0. Notice how
the above definition (via our notation) models the three-pass scenario.
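

The experiment behind (3.1) can be sketched as a sampling loop. Everything named here (`gm_experiment`, the toy key generator, the tapper) is a hypothetical illustration; in particular the toy cryptosystem is deterministic textbook RSA with fixed small parameters, chosen precisely because a trivial tapper defeats it.

```python
import secrets


def gm_experiment(keygen, tapper, m0, m1, trials=2000):
    """Estimate the probability in (3.1) by repeated sampling."""
    wins = 0
    for _ in range(trials):
        m = (m0, m1)[secrets.randbelow(2)]   # m <- {m0, m1}, fixed before E exists
        E = keygen()                         # E <- C(1^n)
        alpha = E(m)                         # alpha <- E(m)
        if tapper(E, m0, m1, alpha) == m:    # tapper answers with m0 or m1
            wins += 1
    return wins / trials


def toy_deterministic_keygen():
    """Hypothetical cryptosystem whose E is deterministic (toy RSA parameters)."""
    return lambda m: pow(m, 17, 3233)


def reencrypting_tapper(E, m0, m1, alpha):
    """Encrypt m0 and compare; this works against any deterministic E."""
    return m0 if E(m0) == alpha else m1


success_rate = gm_experiment(toy_deterministic_keygen, reencrypting_tapper, 5, 9)
```

Against a deterministic E the tapper is always right, so the success rate is 1 rather than close to 1/2. The order of the three sampling steps mirrors the three-pass scenario: the message is fixed before the encryption algorithm is generated.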


3.3 Semantic Security (3-pass)


Again, this definition is essentially the same as in [5]. It can be viewed
as a polynomial-time bounded version of Shannon's "perfect secrecy" [10].
This definition makes use of the probability distributions Pr_n over the sets
of messages M_n. Informally, let f be any function, f : M → V = {any
values the adversary likes}. Intuitively, f should be thought of as some
information about the plaintext that the adversary would like to be able to
compute from the cyphertext, say the first seventeen bits of the plaintext.
A cryptosystem is semantically secure if no adversary, on input E(m), can
compute f(m) more accurately than by random guessing.


Definition Let C be a public-key cryptosystem, M = {M_n} a sequence
of message spaces, and V be any set. Let F = { f_n^E : M_n → V | E ∈ [C(1^n)] }
be any set of functions. For v ∈ V, we denote by (f_n^E)^{-1}(v) the inverse
image of v; that is, the set {m ∈ M_n | f_n^E(m) = v}. Then the probability
of the most probable value for f_n^E(m) is

\[
p_n^E = \max\Bigl\{ \sum_{m \in (f_n^E)^{-1}(v)} \Pr\nolimits_n(m) \;\Big|\; v \in V \Bigr\}.
\]

p_n^E is the maximum probability with which one could guess f_n^E(m) without
having any idea whatsoever what m is.


C is semantically secure if for every family of polynomial-size
probabilistic circuits A = {A_n(·,·)}, for all c > 0, and for all
sufficiently large n,

\[
\Pr\bigl(A_n(E, \alpha) = f_n^E(m) \;\big|\; m \leftarrow M_n;\ E \leftarrow C(1^n);\ \alpha \leftarrow E(m)\bigr) < p_n^E + n^{-c}.
\tag{3.2}
\]
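

The baseline on the right-hand side, the probability of the most probable value of f(m), is easy to compute for a small message space. The following sketch uses a hypothetical three-bit message space with the uniform distribution; the function name `best_blind_guess` and both example functions are assumptions of this illustration.

```python
from collections import defaultdict
from fractions import Fraction


def best_blind_guess(pr_n, f):
    """Largest total probability of an inverse image f^{-1}(v), over all v."""
    mass = defaultdict(Fraction)
    for m, p in pr_n.items():
        mass[f(m)] += p          # accumulate the weight of f^{-1}(f(m))
    return max(mass.values())


# Hypothetical example: the uniform distribution on {0,1}^3.
messages = [format(i, "03b") for i in range(8)]
pr_n = {m: Fraction(1, 8) for m in messages}

p_first_bit = best_blind_guess(pr_n, lambda m: m[0])        # f = first bit
p_all_zero = best_blind_guess(pr_n, lambda m: m == "000")   # f = "is m all zeros?"
```

An adversary who never looks at the cyphertext can already guess the first bit with probability 1/2 and the all-zeros predicate with probability 7/8; semantic security asks that seeing the cyphertext improve on such figures only negligibly.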


3.4 Y-security (3-pass) 


Yao's definition [11] is inspired by information theory, but its context
differs from classical information theory in that the communicating agents,
A(lice) and B(ob), are limited to probabilistic polynomial-time computations.


An intuitive explanation of Yao's definition is the following: A has a
series of n^k messages, selected from a probability space known to both A and
B, and an encryption of each message. She wishes to transmit enough bits
to B so that he can (in polynomial time with very high probability) compute
all the plaintexts. A cryptosystem is Y-secure if the average number of bits
A must send B is the same regardless of whether B possesses a copy of the
cyphertext.


I now make this notion precise, first by defining “Alice and Bob,” and 
then eventually defining Y-security itself. 


Let M = {M_n} be a sequence of message spaces. Each M_n is {0,1}^n
with a fixed probability distribution. (Note that an information theorist
would consider M to be a sequence of sources.)


Let ε(n) be any function that vanishes faster than n^{-c} for all positive c.


For the sake of compactness of notation, the expression \vec{m} will denote
a particular series of n^k messages. That is, \vec{m} stands for
m_1, m_2, ..., m_{n^k}.




Let f be any positive function such that f(n) ≤ n. Intuitively, f(n) is
the number of bits per message that A must transmit to B in order for B
to recover the plaintexts. Recall that all the messages in M_n have length
n.


Definition An f(n) c/d pair (c/d for compressor/decompressor) for M
is a pair of families of probabilistic polynomial-size circuits,
({A_n}, {B_n}), satisfying the following three properties for some constant
k and all sufficiently large n:


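1. "B_n understands A_n."

\[
\Pr\bigl(\vec{m} = \vec{y} \;\big|\; m_1 \leftarrow M_n;\ \ldots;\ m_{n^k} \leftarrow M_n;\ \beta \leftarrow A_n(\vec{m});\ \vec{y} \leftarrow B_n(\beta)\bigr) = 1 - O(\epsilon(n)).
\tag{3.3}
\]


2. "A_n transmits only f(n) bits per message."

\[
\mathrm{Ex}\Bigl(\frac{|\beta|}{n^k} \;\Big|\; m_1 \leftarrow M_n;\ \ldots;\ m_{n^k} \leftarrow M_n;\ \beta \leftarrow A_n(\vec{m})\Bigr) \le f(n).
\tag{3.4}
\]


3. "The output of A_n can be parsed."
   For all polynomials Q there exists a probabilistic polynomial-time
   Turing machine S^Q such that S^Q takes as input n and a concatenated
   string of Q(n) β's, each of which is a good output from A_n, and
   separates them. That is, its input is β_1 β_2 ... β_{Q(n)} and its
   output is β_1 # β_2 # ... # β_{Q(n)}. We require that

\[
\Pr\bigl(S^Q \text{ correctly splits } \beta_1 \beta_2 \ldots \beta_{Q(n)}\bigr) = 1 - O(\epsilon(n)).
\tag{3.5}
\]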


Remark: The requirement that S^Q exist is a technical requirement. It
creates a finite analogue of classical information theory's requirement that
messages be transmitted one bit at a time, in an infinite sequence of bits.


We say that the cost of communicating M is less than or equal to f(n),
in symbols C(M) ≤ f(n), if there exists an f(n) compressor/decompressor
pair for M.


We define C(M) > f(n) to be the negation of C(M) ≤ f(n); that is,
any circuits "communicating M" must use more than f(n) bits. The definition
of C(M) = f(n) is analogous.


Let C be a cryptosystem. We define C(M | E_C(M)), the cost of
communicating M given encryptions from C, in a manner analogous to C(M).
The only difference is that now both A_n and B_n also get E and the n^k
encryptions under some encryption function E ∈ [C(1^n)] as inputs. (We now
call ({A_n}, {B_n}) a shared cyphertext c/d pair.) That is, for this
definition we must rewrite Equation 3.3 above to read:

\[
\begin{aligned}
\Pr\bigl(\vec{m} = \vec{y} \;\big|\;& m_1 \leftarrow M_n;\ \ldots;\ m_{n^k} \leftarrow M_n;\ E \leftarrow C(1^n); \\
& \alpha_1 \leftarrow E(m_1);\ \ldots;\ \alpha_{n^k} \leftarrow E(m_{n^k}); \\
& \beta \leftarrow A_n(E, \vec{m}, \vec{\alpha});\ \vec{y} \leftarrow B_n(E, \beta, \vec{\alpha})\bigr) = 1 - O(\epsilon(n)).
\end{aligned}
\tag{3.6}
\]

An analogous change must also be made to Equation 3.4.


Notice that for this definition, the probabilities involved must be taken 
over the different choices of E from C as well as everything else. 


Definition Let C be a public-key cryptosystem. Fix a sequence of
message spaces M = {M_n} (and thus the probability distribution on each M_n).
We say that C is Y-secure with respect to M if

\[
C(M) = C(M \mid E_C(M)) + O(\epsilon(n)).
\tag{3.7}
\]

We say that C is Y-secure if for all M, C is Y-secure with respect to M.


3.5 The original definitions vs. mine 


As I discussed in Chapter 1, I made minor changes in the cryptographic
scenario from [5] and [11]. Here I will spell out what those changes are and
why they were made.


Changes to Goldwasser and Micali’s Definition 


There are two ways a cryptosystem (the server that generates encryp- 
tion/decryption algorithm pairs) can achieve security: 


1. The cryptosystem gets a description of a message space M (and thus 
its probability distribution) as one of its inputs and will output an 
encryption/decryption algorithm pair to securely encrypt M. 

2. The cryptosystem is told nothing about the message space. The en-
cryption algorithms it outputs are supposed to be secure for every 
possible message space. 


We will call the former cryptosystems aware and the latter oblivious. 


Goldwasser and Micali consider aware cryptosystems for both of their
definitions of security [5]; Yao doesn't make it clear which type of
cryptosystem he is assuming for his definition of security [11]. I believe
it makes more sense to consider oblivious cryptosystems, for both
theoretical and applied reasons.


The theoretical reason for preferring oblivious cryptosystems is that all 
three definitions of security are equivalent. (See Chapter 4.) This is a 
desirable property that fails to hold for aware cryptosystems, as we will 
show in the next section. 


The practical reason for preferring oblivious cryptosystems is that, al- 
though it is certainly conceivable that having knowledge of the message 
space would allow one to design a better encryption algorithm, cryptogra- 
phers have in fact normally tried to design cryptosystems that are secure 
for all message spaces. For example, consider the cryptosystem based on 
arbitrary trapdoor predicates proposed by Goldwasser and Micali [5]. Al- 
though they only considered security in the aware cryptosystem sense, their 
cryptosystem is in fact secure in the stronger, oblivious sense. 


Changes to Yao’s Definition 


In [11], Yao assumes deterministic private key cryptography, but the defi- 
nition is immediately extended to probabilistic public-key cryptography. 


Yao defines the compressor A and decompressor B to be Turing ma- 
chines, not circuits. I have switched to circuits because it is not clear that 
there are any secure cryptosystems with respect to probabilistic Turing ma- 
chines. It might be that one can always achieve greater polynomial-time 
compression given the cyphertext simply because having a shared random 
(enough) string (in this case the cyphertext!) helps. If it does help, how- 
ever, having made the compressor and decompressor nonuniform circuits, 
we can always hardwire in a shared random string of bits. 


3.6 Inequivalence of the original definitions


In this section, we point out that, for aware cryptosystems, GM-security 
is a notion stronger than either semantic security or Y-security. We do 
this in the following two claims, each supported by an informal argument. 
These claims can be easily transformed to theorems after formalizing the 
discussed security notions in terms of aware cryptosystems, a tedious effort 
once we have realized that the aware setting is not the “right” one. 


Claim 1 If any GM-secure aware public-key cryptosystem exists, then there
exist aware public-key cryptosystems that are semantically secure but not
GM-secure.


Let C(·,·) be any GM-secure (and thus semantically secure) aware
cryptosystem. We'll construct a C'(·,·) that is still semantically secure,
but is not GM-secure.


C' behaves identically to C for all message spaces, except for the message
space {0,1}^n with uniform probability distribution. In this case, C' runs C
to compute an encryption algorithm E, and then outputs the algorithm E'
defined by:

\[
E'(x) =
\begin{cases}
0^n & \text{if } x = 0^n \\
1^n & \text{if } x = 1^n \\
E(x) & \text{otherwise}
\end{cases}
\tag{3.8}
\]


C' is clearly not GM-secure, because, for the special message space
described above, there are two messages, 0^n and 1^n, that are easily
distinguished by their encryptions. However, C' is still semantically secure.
Those two messages have such a low probability weight that they won't
give an adversary any significant advantage, on average, in computing a
function of the plaintext on input the cyphertext.
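

The construction in (3.8) amounts to a thin wrapper around E. The sketch below is purely illustrative: messages are modelled as bit strings, `weaken` and `placeholder_E` are hypothetical names, and the placeholder stands in for a genuinely secure probabilistic encryption algorithm (cyphertext length is ignored).

```python
from fractions import Fraction


def weaken(E, n):
    """Build E' from E as in (3.8): 0^n and 1^n are sent in the clear."""
    zeros, ones = "0" * n, "1" * n

    def E_prime(x):
        if x == zeros or x == ones:
            return x
        return E(x)

    return E_prime


def placeholder_E(x):
    """Stand-in for a secure encryption algorithm (NOT secure itself)."""
    return "enc(" + x + ")"


n = 16
E_prime = weaken(placeholder_E, n)

# Under the uniform distribution on {0,1}^n, the two weak messages carry
# total probability 2 / 2^n, which vanishes exponentially in n.
weak_mass = Fraction(2, 2 ** n)
```

A line tapper given m_0 = 0^n and m_1 = 1^n distinguishes their encryptions under E' with certainty, while a message drawn from the uniform distribution is one of the two weak messages only with probability `weak_mass`.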


Claim 2 If any GM-secure aware public-key cryptosystem exists, then there 
exist aware public-key cryptosystems that are Y-secure but not GM-secure. 


We construct exactly the same C' as we did for the previous claim. C' is
of course not GM-secure. However, the two “weak messages” have such 


Chapter 4


Main Results


In this chapter we provide the proof of the equivalence of GM-security,
semantic security, and Y-security. We do so by showing that GM-security is
equivalent to Y-security and that GM-security is equivalent to semantic
security. We present here only three of the four necessary implications.
The proof that GM-security implies semantic security may be found in [5].
We'll present the three proofs in order of increasing difficulty and
technical complexity.


4.1 Semantic Security Implies GM-security 


This proof is quite simple. If a cryptosystem is not GM-secure, then there
exist two messages, m_1 and m_2, which we can easily distinguish. If we make
a new message space in which these are the only messages, then given only
cyphertext, one has a better than random chance of figuring out which of
the two plaintext messages the cyphertext represents.


Theorem 1 Let C be a public-key cryptosystem. If C is semantically
secure, then C is GM-secure.


Proof We prove the contrapositive. Let C be a public-key cryptosystem
that is not GM-secure. We will prove that C is not semantically secure.


Formally, C is not GM-secure means that there exist a line tapper T
and a c > 0 such that for infinitely many n there are m_1^n, m_2^n ∈ M_n for
which

\[
\Pr\bigl(T_n(E, m_1^n, m_2^n, \alpha) = m \;\big|\; m \leftarrow \{m_1^n, m_2^n\};\ E \leftarrow C(1^n);\ \alpha \leftarrow E(m)\bigr) > \frac{1}{2} + n^{-c}.
\tag{4.1}
\]

We construct a new message space M_n as follows: For those n for which
equation 4.1 holds, Pr_n(m_1^n) = 1/2 and Pr_n(m_2^n) = 1/2.


We’ve set up the message space so one can simply guess the plaintext 
by seeing the cyphertext. More precisely, a circuit guessing the plaintext 
from the cyphertext can use T as a subroutine and thus obtain a polynomial 
advantage. On the other hand, without seeing the cyphertext, circuits with 
no input can only randomly guess the plaintext. Q.E.D. 


4.2 Y-security implies GM-security 


In the proof of the next theorem, we use a technical lemma that is a varia- 
tion of Chernoff’s bound [2]. The derivation of the lemma from Chernoff’s 
bound is in the appendix. 


Lemma 1 Let X be a random variable having binomial distribution, with
n trials and probability of success p. For $0 < a \le 1/2 \le p < 1$, we have
$\Pr(X \le an) \le e^{-2(p-a)^2 n}$.
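As a sanity check on the lemma, the exact binomial tail can be compared numerically with the bound $e^{-2(p-a)^2 n}$. The sketch below is illustrative only; the function names are assumptions introduced here, and `a` is passed as a `Fraction` so that the cut-off $\lfloor an \rfloor$ is computed exactly.

```python
from fractions import Fraction
from math import comb, exp, floor


def binomial_lower_tail(n, p, k):
    """Exact Pr(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def lemma1_bound(n, p, a):
    """Right-hand side of Lemma 1: exp(-2 (p - a)^2 n)."""
    return exp(-2 * (p - float(a)) ** 2 * n)


def compare(n, p, a):
    """Return (exact Pr(X <= a n), Lemma 1 bound); X <= a n iff X <= floor(a n)."""
    k = floor(a * n)
    return binomial_lower_tail(n, p, k), lemma1_bound(n, p, a)
```

For instance, `compare(100, 0.6, Fraction(1, 2))` returns the probability that at most half of 100 trials succeed when each succeeds with probability 0.6, together with the bound $e^{-2}$.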


Theorem 2 (Rackoff [7]) Let C be a cryptosystem. If C is Y-secure, then
C is GM-secure.


Proof Again we will prove the contrapositive. Let C be a cryptosystem
that is not GM-secure for some message space M. There exists a
family of line tappers $T = \{T_n\}$ such that for infinitely many n, there
are $m_0^n, m_1^n \in M_n$ such that $T_n$ can distinguish between them. Consider
now a new message space M' that, for those n, has $\Pr_n(m_0^n) = 1/2$,
$\Pr_n(m_1^n) = 1/2$, and $\Pr_n(m) = 0$ for all other $m \in \{0,1\}^n$.

Clearly $C(M') = 1$: any circuits not sharing cyphertext will need one
bit per message to communicate outputs from M'. This fact follows from
classical information theory considerations.


On the other hand, we will now show that $C(M' \mid E(M')) \le 1 - 1/n^k$.
(The value of the constant k will be specified below.) This value is achieved
by a shared cyphertext c/d pair that transmits $n^k$ messages at a time.


$A_n$ gets $n^k$ messages in both plaintext and cyphertext as its input. Since
there are only two messages in $M'_n$, each message can be considered to
be a bit b and each cyphertext the encryption of a bit. That is to say,
$A_n$'s input is $b_1, b_2, \ldots, b_{n^k}, \alpha_1, \alpha_2, \ldots, \alpha_{n^k}$ where $\alpha_i \in [E(b_i)]$. $A_n$ now
XORs each adjacent pair of messages (bits). That is, put $c_i = b_i \oplus b_{i+1}$ for
$i = 1, 2, \ldots, n^k - 1$. Put $\beta = c_1 c_2 \cdots c_{n^k - 1}$. This $\beta$ is the “hint” that $A_n$
sends to $B_n$. Obviously, $|\beta|/n^k = 1 - 1/n^k$.


Now, can $B_n$, given $\beta$ and the $\alpha_i$s as its input, determine the plaintext
with probability $1 - O(e^{-n})$? Yes. The “hint”, $\beta$, constrains $B_n$ to only
two possible choices of values for the $b_i$. That is, if $B_n$ decides that $b_1 = 0$,
then it knows the value of all the bits, say $v_1 v_2 \ldots v_{n^k}$. On the other hand,
if $B_n$ decides that $b_1 = 1$, then the whole series of messages must have
been $\bar{v}_1 \bar{v}_2 \ldots \bar{v}_{n^k}$ (where $\bar{v}$ is the complement of $v$).


$B_n$ also has a line tapper, $T_n$, that it can use to test the $\alpha_i$. $B_n$ runs
$T_n$ on each $\alpha_i$ and obtains $T_n$'s opinion as to what each bit was. Call
this sequence $t_1 t_2 \ldots t_{n^k}$. Since C is not GM-secure, each $t_i$ is correct with
probability $p = \frac{1}{2} + 1/n^j$, for some fixed j. By Lemma 1 (with $a = 1/2$ and
“$n = n^k$”), if we make $k \ge 2j + 1$, then the majority of $t_i$s will be correct
with probability $1 - O(e^{-n})$. $B_n$ compares the $t_i$ to both the $v_i$ and the $\bar{v}_i$,
and decides either $b_1 = 0$ if the majority of $t_i$ coincide with the $v_i$, or $b_1 = 1$
if the majority of the $t_i$ coincide with the $\bar{v}_i$. Q.E.D.
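The encoding and decoding steps of this proof can be sketched as follows. The function names and the representation of messages as lists of 0/1 integers are assumptions made for illustration; the tapper's guesses are simply passed in as a list. Note that with $p = 1/2 + 1/n^j$, $a = 1/2$ and $n^k$ trials, the exponent in Lemma 1 is $2 n^{k-2j}$, which is at least $2n$ once $k \ge 2j + 1$.

```python
def make_hint(bits):
    """A_n's hint: XOR of each adjacent pair of plaintext bits (one bit shorter than the input)."""
    return [bits[i] ^ bits[i + 1] for i in range(len(bits) - 1)]


def candidates(hint):
    """The two bit sequences consistent with the hint: v (starting with 0) and its complement."""
    v = [0]
    for c in hint:
        v.append(v[-1] ^ c)
    return v, [1 - b for b in v]


def decode(hint, tapper_guesses):
    """B_n's rule: keep whichever candidate agrees with the majority of the tapper's guesses."""
    v, v_bar = candidates(hint)
    agree_v = sum(1 for g, b in zip(tapper_guesses, v) if g == b)
    # Agreement with v_bar is len(v) - agree_v, because v_bar flips every bit of v.
    return v if 2 * agree_v > len(v) else v_bar
```

The hint has one bit fewer than the message sequence, which is the source of the $1 - 1/n^k$ bits-per-message cost, and decoding succeeds whenever more than half of the tapper's guesses are right.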


4.3  GM-security implies Y-security 


Theorem 3 Let C be a public-key cryptosystem. If C is GM-secure, then
C is Y-secure.


Proof We'll prove the contrapositive. A bird's eye view of our proof is
as follows. Assuming that C is not Y-secure, there exists a good shared
cyphertext c/d pair that manages to communicate using “few” bits. This
pair will allow us to test (for some special pair of messages $m_1$ and $m_2$)
whether a particular $\alpha$ is the encryption of either $m_1$ or $m_2$, thus violating
the GM-security condition. Namely, if the pair works successfully on inputs
$\alpha$ and $m_1$, we declare $\alpha$ to be an encryption of $m_1$; otherwise we declare $\alpha$
to be an encryption of $m_2$.


Let us proceed formally. Since C is not Y-secure, there is a particular
message space sequence, $M = \{M_n\}$, such that C is not Y-secure for M.
That is to say, there exist a shared cyphertext c/d pair $AB = (\{A_n\}, \{B_n\})$,
a positive integer k, and a polynomial P such that


(*) $A_n$ communicates $n^k$ messages from $M_n$ to $B_n$ using “few” bits per
message, on top of the cyphertext which they get to share for free.


(**) Furthermore, for every c/d pair AB', there exists an infinite subset
$N' \subseteq \mathbb{N}$, such that for all $n \in N'$, on average AB' uses at least $1/P(n)$
more bits per message than AB does.


We’re now going to run a series of experiments to see how AB behaves 
on inputs that it doesn’t “expect.” We begin, however, by running a control 
experiment: 


In experiment $n$-$\mathrm{EXP}_0$, we pick $n^k$ messages $m_i$ at random from $M_n$
and an E at random from $[C(1^n)]$, and run $A_n$ on input

\[
\begin{array}{ccccc}
m_1 & m_2 & m_3 & \cdots & m_{n^k} \\
E(m_1) & E(m_2) & E(m_3) & \cdots & E(m_{n^k})
\end{array}
\]

(The output will be a string $\beta$ such that $B_n$, on inputs $\beta$ and
$E(m_1), \ldots, E(m_{n^k})$, will output $m_1, \ldots, m_{n^k}$ with overwhelming
probability.)


Now consider the following experiment, $n$-$\mathrm{EXP}_i$: This time we again
pick $n^k$ messages and an E at random, but we also pick one more message,
r, at random from $M_n$, and set $\rho = E(r)$. Now we run $A_n$ with i copies
of $\rho$ replacing the first i cyphertexts in its input, and then run $B_n$ on $A_n$'s
output. A “picture” of $A_n$'s input is

\[
\begin{array}{cccccc}
m_1 & \cdots & m_i & m_{i+1} & \cdots & m_{n^k} \\
\rho & \cdots & \rho & E(m_{i+1}) & \cdots & E(m_{n^k})
\end{array}
\]

Definition We define the difference between $n$-$\mathrm{EXP}_i$ and $n$-$\mathrm{EXP}_j$,
$d_n(i,j)$, to be the maximum of the average difference between


1. the length of the $\beta$s output by $A_n$ in the two experiments, and


2. the frequency with which $B_n$ recovers the correct plaintexts in the
two experiments.


Claim 1: There exists a polynomial Q such that for an infinite subset
$N'' \subseteq \mathbb{N}$ and for all $n \in N''$, $d_n(0, n^k) > 1/Q(n)$.


Proof of Claim 1: By contradiction. Assume that for all sufficiently
large n, $n$-$\mathrm{EXP}_0$ is indistinguishable from $n$-$\mathrm{EXP}_{n^k}$. Then $A_n$ and $B_n$ still
function successfully on input

\[
\begin{array}{ccccc}
m_1 & m_2 & m_3 & \cdots & m_{n^k} \\
\rho & \rho & \rho & \cdots & \rho
\end{array}
\]

where $m_i$, r, $\rho$, and E are as above. We now construct $A'_n$, $B'_n$ to violate
(**). We simply hardwire the encryption of some random string into a
pair of circuits identical to $A_n$ and $B_n$ but not sharing cyphertext. By
assumption, these circuits are a c/d pair violating (**). □


Claim 2: For all $n \in N''$, there is a polynomial Q' and an i, $0 \le i \le n^k - 1$,
such that $d_n(i, i+1) > 1/Q'(n)$.


Proof of Claim 2: Fix $n \in N''$. $d_n(0,0) = 0$ and $d_n(0,n^k) > 1/Q(n)$.
Therefore, there must be an i such that $d_n(i,i+1) > \frac{1}{n^k Q(n)}$. □
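The counting step behind Claim 2 can be written out explicitly. The display below is a sketch under two assumptions that the text does not state: that $d_n$ satisfies the triangle inequality (which holds if each of its two components is an absolute difference of averages), and that one takes $Q'(n) = n^k Q(n)$.

```latex
\[
\frac{1}{Q(n)} \;<\; d_n(0, n^k)
\;\le\; \sum_{i=0}^{n^k - 1} d_n(i, i+1)
\;\le\; n^k \cdot \max_{0 \le i \le n^k - 1} d_n(i, i+1),
\qquad\text{hence}\qquad
\max_{0 \le i \le n^k - 1} d_n(i, i+1) \;>\; \frac{1}{n^k\, Q(n)} \;=\; \frac{1}{Q'(n)}.
\]
```

Since $n^k Q(n)$ is again a polynomial in n, the gap between some pair of adjacent experiments is still inverse-polynomial.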


Let $n \in N''$. For simplicity, but WLOG, consider the case where $i = 0$
in Claim 2, and $d_n(0,1)$ is due to a difference in the length of $A_n$'s output
(rather than $B_n$'s success rate).


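Let us restate Claim 2 in a more convenient form. Consider the following
joint experiment, $n$-$\mathrm{EXP}_{01}$. Randomly draw $r, m_1, \ldots, m_{n^k}$ from $M_n$ and
set $E \leftarrow C(1^n)$. Run both $n$-$\mathrm{EXP}_0$ and $n$-$\mathrm{EXP}_1$ on the same inputs. That
is, run $n$-$\mathrm{EXP}_0$ on input

\[
\begin{array}{ccccc}
m_1 & m_2 & m_3 & \cdots & m_{n^k} \\
E(m_1) & E(m_2) & E(m_3) & \cdots & E(m_{n^k})
\end{array}
\]

to compute $A_n$'s output $\beta_0$ and run $n$-$\mathrm{EXP}_1$ on input

\[
\begin{array}{ccccc}
m_1 & m_2 & m_3 & \cdots & m_{n^k} \\
E(r) & E(m_2) & E(m_3) & \cdots & E(m_{n^k})
\end{array}
\]

to compute $A_n$'s output on this input, $\beta_1$. The output of $n$-$\mathrm{EXP}_{01}$ is
$|\beta_1| - |\beta_0|$. Then, by the linearity of the average we get that the expected
value of the output of $n$-$\mathrm{EXP}_{01}$ is at least $1/Q'(n)$.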


From this it immediately follows that 


(***) there exist $r, m_1, m_2, \ldots, m_{n^k}$ in $M_n$ such that the expected value of
the output of $n$-$\mathrm{EXP}_{01}$ is still greater than $1/Q'(n)$ when the average
of the length of $\beta$ is computed only over the choice of $E \leftarrow C(1^n)$ and
of encryptions of messages.


Now for all $n \in N''$ we can build a tapper $T_n$ that will succeed in
distinguishing two messages $m_0^n$ and $m_1^n$, described below.


Fix r and the $m_i$ to be messages that fit the requirements of (***). We set
$T_n$'s inputs: $m_0^n = m_1$ and $m_1^n = r$. $T_n$ gets as inputs $E \in [C(1^n)]$, $m_0^n$, $m_1^n$,
and $\alpha$, where either $\alpha \in [E(m_0^n)]$ or $\alpha \in [E(m_1^n)]$.


$T_n$ picks $m \in \{m_0^n, m_1^n\}$ at random and runs $A_n$ on input

\[
\begin{array}{ccccc}
m & m_2 & m_3 & \cdots & m_{n^k} \\
\alpha & E(m_2) & E(m_3) & \cdots & E(m_{n^k})
\end{array}
\]

to compute a $\beta$. There is some threshold length value v for the experiment
described at (***) such that if $|\beta| < v$ it is more likely that $\alpha \in [E(m)]$
and if $|\beta| > v$ it is more likely that $\alpha \in [E(r)]$. Thus $T_n$ compares $|\beta|$ to v
and outputs its verdict accordingly. Q.E.D.
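The threshold test at the end of the proof can be illustrated on explicit length distributions. The sketch below is a toy calculation, not part of the proof: the two dictionaries mapping hint lengths to probabilities, the function names, and the tie-breaking convention (lengths equal to v are counted as "short") are all assumptions introduced here.

```python
from fractions import Fraction


def threshold_success(len_dist_m, len_dist_r, v):
    """Success probability of the rule: answer 'E(m)' if |beta| <= v, else 'E(r)'.

    Each of the two cases is assumed to occur with probability 1/2.
    """
    p_short_given_m = sum(pr for length, pr in len_dist_m.items() if length <= v)
    p_long_given_r = sum(pr for length, pr in len_dist_r.items() if length > v)
    return (p_short_given_m + p_long_given_r) / 2


def best_threshold(len_dist_m, len_dist_r):
    """Return (success probability, v) for the best threshold among the observed lengths."""
    lengths = sorted(set(len_dist_m) | set(len_dist_r))
    return max(
        ((threshold_success(len_dist_m, len_dist_r, v), v) for v in lengths),
        key=lambda pair: pair[0],
    )
```

For example, if the hint has length 3 or 4 with equal probability when $\alpha$ encrypts m, but length 4 with probability 3/4 when $\alpha$ encrypts r, the threshold v = 3 gives success probability 5/8. In the proof the analogous value v is hardwired into $T_n$.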


Notice that at several points in the proof we took advantage of the fact
that $T_n$ is nonuniform. v is hardwired into $T_n$, as are $r, m_1, \ldots, m_{n^k}$. In
fact, most of these uses of nonuniformity could be replaced by polynomial
size Monte Carlo experiments. However, $T_n$ must be nonuniform since $A_n$
and $B_n$ are nonuniform.


One-Pass Scenarios


In this chapter we present the proper definitions for one-pass cryptogra- 
phy, and then go on to show that these definitions are all equivalent to one 
another. (They are not equivalent to the three-pass definitions.) These 
definitions are all considerably more complicated than the analogous defi- 
nitions for the three-pass scenario. 


5.1 GM-security (1-pass) 


As discussed at the beginning of Chapter 3, for a one-pass cryptosystem, 
we must change from requiring security “for all messages m,” to requiring 
security for every message m that is efficiently computable on input the 
encryption algorithm alone. In order to do this, we introduce an adversary 
called a message finder. 


A message finder is a family of polynomial-size probabilistic circuits
$F = \{F_n(\cdot)\}$ each of which takes the description of an encryption algorithm
as its input and has two messages of length n as its output. Intuitively, on
input E, $F_n$ tries to find $m_0$ and $m_1$ such that it's easy for a fellow adversary
(a line tapper) to distinguish encryptions of $m_0$ from encryptions of $m_1$.


Definition Let C be a public-key cryptosystem. C is GM-secure (one-pass)
if for all message finders F, line tappers T, and c > 0, for all sufficiently
large n,

\[
\Pr\bigl(T_n(E, m_0, m_1, \alpha) = m \mid E \leftarrow C(1^n);\ m_0, m_1 \leftarrow F_n(E);\ m \leftarrow \{m_0, m_1\};\ \alpha \leftarrow E(m)\bigr) \le \frac{1}{2} + \frac{1}{n^c}. \tag{5.1}
\]
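The experiment inside the probability in (5.1) can be read as a game, sketched below. Everything concrete here is an assumption made for illustration: encryption algorithms are modelled as Python callables rather than circuit descriptions, and `toy_keygen` is a deliberately insecure, deterministic scheme chosen only to show an adversary winning the game.

```python
import random


def gm_one_pass_game(keygen, finder, tapper, n, trials, rng):
    """Estimate the success probability of (finder, tapper) in the one-pass GM experiment."""
    wins = 0
    for _ in range(trials):
        E = keygen(n, rng)            # E <- C(1^n)
        m0, m1 = finder(E, n)         # m0, m1 <- F_n(E): the finder sees E first
        m = rng.choice((m0, m1))      # m <- {m0, m1}
        alpha = E(m)                  # alpha <- E(m)
        if tapper(E, m0, m1, alpha) == m:
            wins += 1
    return wins / trials


def toy_keygen(n, rng):
    """Hypothetical insecure scheme: deterministic XOR with a public n-bit pad."""
    pad = [rng.randrange(2) for _ in range(n)]
    return lambda m: "".join(str(int(b) ^ k) for b, k in zip(m, pad))


def fixed_finder(E, n):
    """A message finder that ignores E and always proposes 0^n and 1^n."""
    return "0" * n, "1" * n


def reencrypting_tapper(E, m0, m1, alpha):
    """Beats any deterministic E: re-encrypt m0 and compare with the cyphertext."""
    return m0 if E(m0) == alpha else m1


def blind_tapper(E, m0, m1, alpha):
    """Baseline that ignores the cyphertext entirely."""
    return m0
```

Against the deterministic toy scheme the re-encrypting tapper wins every round, while the blind tapper wins about half the time. A GM-secure system must keep every polynomial-size pair within $1/n^c$ of the blind baseline.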


5.2 Semantic Security (1-pass) 


To change the definition of semantic security to fit the one-pass scenario, 
we need to introduce something like the message finders of the previous sec- 
tion. For semantic security, however, we’re concerned not with finding two 
“weak” messages, but rather with the probability distribution of the en- 
tire message space. Thus our second adversary will not pick out particular 
messages, but instead set the probability distribution of the message space. 
Furthermore, we now explicitly give the other adversary a description of 
that probability distribution. 


A message space enemy is a family of polynomial-size probabilistic circuits
$B = \{B_n(\cdot)\}$. Each $B_n$ takes the description of an encryption algorithm
as its input, and outputs the description of a probabilistic Turing machine
N(). N outputs elements of $\{0,1\}^n$ with some probability distribution.


As in the three-pass definition, we let V be any set and let $F =
\{f^E : M_n \to V \mid E \in [C(1^n)]\}$ be any set of functions. Again set $p^E$
to be the probability of the most probable value for $f^E(m)$; set $p^E =
\max\{\sum_{m \in (f^E)^{-1}(v)} \Pr_n(m) \mid v \in V\}$.


Definition Let C be a public-key cryptosystem. C is semantically secure
if for every message space enemy B, family of polynomial-size probabilistic
circuits $A = \{A_n(\cdot,\cdot,\cdot)\}$, and c > 0, for all sufficiently large n

\[
\Pr\bigl(A_n(E, N, \alpha) = f^E(m) \mid E \leftarrow C(1^n);\ N \leftarrow B_n(E);\ m \leftarrow N();\ \alpha \leftarrow E(m)\bigr) < p^E + \frac{1}{n^c}. \tag{5.2}
\]


5.3  Y-security (1-pass)


The changes that must be made to the definition of Y-security are com- 
pletely analogous to the changes we made to the definition of semantic 
security. 


5.4 Equivalence 


The proofs that the three definitions of security are all equivalent are quite
similar to the proofs for the three-pass case. Here we will redo only the
proof of the easiest of the four implications, semantic security implies
GM-security. This proof shows the additional details that must be taken into
consideration when working with the one-pass scenario.


Theorem 4 GM-security (1-pass), semantic security (1-pass), and Y-se- 
curity (1-pass) are all equivalent. 


Proof that semantic security (1-pass) implies GM-security (1-pass): We
will, as usual, prove the contrapositive. Let C be a public-key cryptosystem
that is not GM-secure. We know that there exist a message finder family of
circuits $F = \{F_n\}$ and a line tapper family of circuits $T = \{T_n\}$. We will use
the $F_n$ as subroutines (circuit components to be precise) for building our
message space enemy circuits and then use the $T_n$ to do the distinguishing.


Our message space enemy, $B_n$, on input an encryption algorithm $E \in
[C(1^n)]$, runs $F_n$ with input E. $F_n$ outputs two messages, $m_0, m_1 \in \{0,1\}^n$.
$B_n$ outputs the design of a Turing machine N() such that

\[
N() \text{ outputs } \begin{cases} m_0 & \text{with probability } 1/2 \\ m_1 & \text{with probability } 1/2 \end{cases}
\]

An adversary A who uses $T_n$ as a subroutine can distinguish encryptions
of $m_0$ from encryptions of $m_1$. In other words, on the message space
defined by the output of N(), A can compute the function $f(m) = m$ (with
probability greater than at random) given only an encryption of m. A gets
E and $\alpha$ where either $\alpha \in [E(m_0)]$ or $\alpha \in [E(m_1)]$. However, $T_n$ also
requires $m_0$ and $m_1$ as inputs. A can obtain that information for $T_n$ simply
by running N a few times.


Formally, since C is not GM-secure we know that there exists a c > 0
such that for infinitely many n

\[
\Pr\bigl(T_n(E, m_0, m_1, \alpha) = m \mid E \leftarrow C(1^n);\ m_0, m_1 \leftarrow F_n(E);\ m \leftarrow \{m_0, m_1\};\ \alpha \leftarrow E(m)\bigr) > \frac{1}{2} + n^{-c}. \tag{5.3}
\]

For those n, Equation 5.3 in fact says that Pr(A computes $f(m) = m$
correctly) $> \frac{1}{2} + n^{-c}$. Q.E.D.
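The reduction in this proof can be sketched as code. All names below are hypothetical, encryption algorithms and the machine N are modelled as Python callables, and the sample count of 64 is an arbitrary choice. One detail the sketch makes visible, and which the text does not address, is that sampling N recovers the two messages only as a set, so here they are handed to the tapper in sorted order.

```python
import random


def make_enemy(finder, rng):
    """Build a message space enemy B_n from a message finder F_n."""

    def enemy(E, n):
        m0, m1 = finder(E, n)

        def N():
            # N() outputs m0 or m1, each with probability 1/2.
            return rng.choice((m0, m1))

        return N

    return enemy


def recover_messages(N, samples=64):
    """Learn the support of N by running it a few times (misses a message w.p. 2^-63)."""
    return sorted({N() for _ in range(samples)})


def adversary(tapper, E, N, alpha):
    """Adversary A: recover m0, m1 from N, then let the line tapper decide."""
    support = recover_messages(N)
    if len(support) < 2:
        return support[0]   # only one message was ever seen; guess it
    m0, m1 = support
    return tapper(E, m0, m1, alpha)
```

A tapper whose behaviour depends on the order of its two message arguments would need the order fixed some other way, for example by having the enemy encode it in the description of N.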


Appendix A 


Proof of Lemma 1 




In this appendix we provide a proof of Lemma 1. 


Let X be a random variable having a binomial distribution with n trials
and probability of success p. For $0 < p, a < 1$, define

\[
f(p,a) = a \log\frac{a}{p} + (1 - a) \log\left(\frac{1-a}{1-p}\right). \tag{A.1}
\]

Chernoff [2] gave the following upper bound for estimating $\Pr(X \le an)$.

Theorem 5 (Chernoff) For $0 < a < p$, we have $\Pr(X \le an) \le e^{-f(p,a) n}$.

The following useful fact was pointed out to me by Ravi Boppana [1].

Theorem 6 For $0 < a \le 1/2 \le p < 1$, we have $f(p,a) \ge 2(p - a)^2$.
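Proof We first compute

\[
\frac{\partial f}{\partial a} = \log\frac{a}{p} - \log\left(\frac{1-a}{1-p}\right). \tag{A.2}
\]

Notice that for all $p \in [0,1]$, $f(p,p)$ and $\frac{\partial f}{\partial a}(p,p)$ are equal to zero. Taylor's
theorem (see, for instance, [9]) states that for any “nice” real function g
defined on $[\alpha, \beta]$,

\[
\exists z \in [\alpha, \beta]: \quad g(\beta) = \sum_{k=0}^{n-1} \frac{g^{(k)}(\alpha)}{k!} (\beta - \alpha)^k + \frac{g^{(n)}(z)}{n!} (\beta - \alpha)^n. \tag{A.3}
\]

The same statement holds with the roles of the two endpoints exchanged.
Thus, taking Equation A.3 with n = 2 and expanding $f(p, \cdot)$ about the point p
(where both f and $\frac{\partial f}{\partial a}$ vanish), we see that there is a $z \in [a, p]$ such
that

\[
f(p,a) = \frac{1}{2} \frac{\partial^2 f}{\partial a^2}(p,z) \, (p - a)^2. \tag{A.4}
\]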


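Differentiating f again, we see that $\frac{\partial^2 f}{\partial a^2}(p,a) = \frac{1}{a(1-a)}$, so

\begin{align}
f(p,a) &= \frac{1}{2z(1-z)} (p - a)^2 \tag{A.5} \\
&\ge \min_{z \in [a,p]} \frac{1}{2z(1-z)} (p - a)^2. \tag{A.6}
\end{align}

The function $z \mapsto \frac{1}{z(1-z)}$ is increasing on [1/2,1] and, by symmetry about
1/2, decreasing on (0,1/2], so it is smallest at z = 1/2, which lies in $[a, p]$.
Thus inequality A.6 still holds if we take z = 1/2 and rewrite the right hand
side as $2(p - a)^2$, which completes our proof. Q.E.D.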
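Theorem 6 is easy to spot-check numerically. The sketch below evaluates $f(p,a) - 2(p-a)^2$ on a grid with $0 < a \le 1/2 \le p < 1$; the function names and the grid resolution are assumptions made for illustration, and a grid check is of course not a proof.

```python
from math import log


def f(p, a):
    """f(p, a) from (A.1), using natural logarithms."""
    return a * log(a / p) + (1 - a) * log((1 - a) / (1 - p))


def theorem6_gap(p, a):
    """f(p, a) - 2 (p - a)^2, which Theorem 6 says is non-negative."""
    return f(p, a) - 2 * (p - a) ** 2


def min_gap_on_grid(steps=50):
    """Smallest gap over a grid of a in (0, 1/2] and p in [1/2, 1)."""
    gaps = []
    for i in range(1, steps + 1):
        a = 0.5 * i / steps
        for j in range(steps):
            p = 0.5 + 0.5 * j / steps
            gaps.append(theorem6_gap(p, a))
    return min(gaps)
```

The gap is zero at $a = p = 1/2$ and positive elsewhere on the grid, up to floating-point rounding.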


Recall that Lemma 1 states that for $0 < a \le 1/2 \le p < 1$,

\[
\Pr(X \le an) \le e^{-2(p-a)^2 n}. \tag{A.7}
\]

Inequality A.7 follows as an immediate corollary of Theorems 5 and 6.